Skip to content

[clang][analyzer] Add C standard streams to the internal memory space #147766

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

balazske
Copy link
Collaborator

@balazske balazske commented Jul 9, 2025

If variables stdin, stdout, stderr are added to the system memory space, they are invalidated at any call to system (or C library) functions. These variables are not expected to be changed by system functions, therefore should not belong to the system memory space. This change moves these to the "internal memory space" which is not invalidated at system calls (but is at non-system calls). This eliminates some false positives coming from StreamChecker.

If variables `stdin`, `stdout`, `stderr` are added to the system memory
space, they are invalidated at any call to system (or C library) functions.
These variables are not expected to be changed by system functions,
therefore should not belong to the system memory space.
This change moves these to the "internal memory space" which is not
invalidated at system calls (but is at non-system calls). This
eliminates some false positives coming from StreamChecker.
@balazske balazske requested review from steakhal and NagyDonat July 9, 2025 15:08
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels Jul 9, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 9, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Balázs Kéri (balazske)

Changes

If variables stdin, stdout, stderr are added to the system memory space, they are invalidated at any call to system (or C library) functions. These variables are not expected to be changed by system functions, therefore should not belong to the system memory space. This change moves these to the "internal memory space" which is not invalidated at system calls (but is at non-system calls). This eliminates some false positives coming from StreamChecker.


Full diff: https://github.com/llvm/llvm-project/pull/147766.diff

2 Files Affected:

  • (modified) clang/lib/StaticAnalyzer/Core/MemRegion.cpp (+17-3)
  • (modified) clang/test/Analysis/stream.c (+42-3)
diff --git a/clang/lib/StaticAnalyzer/Core/MemRegion.cpp b/clang/lib/StaticAnalyzer/Core/MemRegion.cpp
index 63ad567ab7151..4e946f39c1cef 100644
--- a/clang/lib/StaticAnalyzer/Core/MemRegion.cpp
+++ b/clang/lib/StaticAnalyzer/Core/MemRegion.cpp
@@ -1054,10 +1054,24 @@ const VarRegion *MemRegionManager::getVarRegion(const VarDecl *D,
     assert(!Ty.isNull());
     if (Ty.isConstQualified()) {
       sReg = getGlobalsRegion(MemRegion::GlobalImmutableSpaceRegionKind);
-    } else if (Ctx.getSourceManager().isInSystemHeader(D->getLocation())) {
-      sReg = getGlobalsRegion(MemRegion::GlobalSystemSpaceRegionKind);
     } else {
-      sReg = getGlobalsRegion(MemRegion::GlobalInternalSpaceRegionKind);
+      StringRef N = D->getNameAsString();
+      QualType FILETy = D->getASTContext().getFILEType();
+      if (!FILETy.isNull())
+        FILETy = FILETy.getCanonicalType();
+      Ty = Ty.getCanonicalType();
+      bool IsStdStreamVar = Ty->isPointerType() &&
+                            Ty->getPointeeType() == FILETy &&
+                            (N == "stdin" || N == "stdout" || N == "stderr");
+      // Pointer value of C standard streams is usually not modified by system
+      // calls. This means they should not get invalidated at system calls and
+      // can not belong to the system memory space.
+      if (Ctx.getSourceManager().isInSystemHeader(D->getLocation()) &&
+          !IsStdStreamVar) {
+        sReg = getGlobalsRegion(MemRegion::GlobalSystemSpaceRegionKind);
+      } else {
+        sReg = getGlobalsRegion(MemRegion::GlobalInternalSpaceRegionKind);
+      }
     }
 
   // Finally handle static locals.
diff --git a/clang/test/Analysis/stream.c b/clang/test/Analysis/stream.c
index 758b40cca4931..781c023ea248a 100644
--- a/clang/test/Analysis/stream.c
+++ b/clang/test/Analysis/stream.c
@@ -519,14 +519,53 @@ void reopen_std_stream(void) {
   if (!fp) return;
 
   stdout = fp; // Let's make them alias.
-  clang_analyzer_eval(fp == oldStdout);     // expected-warning {{UNKNOWN}}
-  clang_analyzer_eval(fp == stdout);        // expected-warning {{TRUE}} no-FALSE
-  clang_analyzer_eval(oldStdout == stdout); // expected-warning {{UNKNOWN}}
+  clang_analyzer_eval(fp == oldStdout);     // expected-warning {{FALSE}}
+  clang_analyzer_eval(fp == stdout);        // expected-warning {{TRUE}}
+  clang_analyzer_eval(oldStdout == stdout); // expected-warning {{FALSE}}
 }
 
 void only_success_path_does_not_alias_with_stdout(void) {
   if (stdout) return;
   FILE *f = fopen("/tmp/foof", "r"); // no-crash
+  clang_analyzer_eval(f == 0);// expected-warning {{TRUE}} expected-warning {{FALSE}}
   if (!f) return;
   fclose(f);
 }
+
+extern void do_something();
+
+void test_no_invalidate_at_system_call(int use_std) {
+  FILE *fd;
+  char *buf;
+
+  if (use_std) {
+    fd = stdin;
+  } else {
+    if ((fd = fopen("x/y/z", "r")) == NULL)
+      return;
+
+    clang_analyzer_eval(fd == stdin); // expected-warning{{FALSE}}
+    buf = (char *)malloc(100);
+    clang_analyzer_eval(fd == stdin); // expected-warning{{FALSE}}
+  }
+
+  if (fd != stdin)
+    fclose(fd);
+}
+
+void test_invalidate_at_non_system_call(int use_std) {
+  FILE *fd;
+
+  if (use_std) {
+    fd = stdin;
+  } else {
+    if ((fd = fopen("x/y/z", "r")) == NULL)
+      return;
+  }
+
+  do_something();
+  clang_analyzer_eval(fd == stdin); // expected-warning{{UNKNOWN}}
+
+  if (fd != stdin)
+    fclose(fd);
+} // expected-warning{{Opened stream never closed. Potential resource leak}}

@balazske balazske requested a review from gamesh411 July 9, 2025 15:10
Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code change LGTM, my only suggestion is about word choices in a comment.

It's nice to see that this commit cleans up some ugly false positives :)

Comment on lines 1066 to 1068
// Pointer value of C standard streams is usually not modified by system
// calls. This means they should not get invalidated at system calls and
// can not belong to the system memory space.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Pointer value of C standard streams is usually not modified by system
// calls. This means they should not get invalidated at system calls and
// can not belong to the system memory space.
// Pointer value of C standard streams is usually not modified by calls
// to functions declared in system headers. This means that they should
// not get invalidated by calls to functions declared in system headers,
// so they are placed in the global internal space, which is not
// invalidated by calls to functions declared in system headers.

I would prefer avoiding the phrase "system call" because it has a very specific meaning which is not what you speak about here.

Also, I extended the comment to describe the reason why is the global internal space a "better" place for these streams, because I felt that this makes it easier to quickly understand the goals of this logic.

Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest rephrasing the PR to something more descriptive: Preserve stdin and friends on system calls

} else {
sReg = getGlobalsRegion(MemRegion::GlobalInternalSpaceRegionKind);
StringRef N = D->getNameAsString();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getNameAsString() returns a std::string prvalue, thus we bind a view to this temporary that gets immediately dangling after the full-expression.

Comment on lines +1059 to +1065
QualType FILETy = D->getASTContext().getFILEType();
if (!FILETy.isNull())
FILETy = FILETy.getCanonicalType();
Ty = Ty.getCanonicalType();
bool IsStdStreamVar = Ty->isPointerType() &&
Ty->getPointeeType() == FILETy &&
(N == "stdin" || N == "stdout" || N == "stderr");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All this work is basically done to set the IsStdStreamVar.
This seems pretty complicated and not strictly necessary for getVarRegion itself.
This prompts me if we should hoist this into a helper function that we can call here. WDYT?

Comment on lines +537 to +554
void test_no_invalidate_at_system_call(int use_std) {
FILE *fd;
char *buf;

if (use_std) {
fd = stdin;
} else {
if ((fd = fopen("x/y/z", "r")) == NULL)
return;

clang_analyzer_eval(fd == stdin); // expected-warning{{FALSE}}
buf = (char *)malloc(100);
clang_analyzer_eval(fd == stdin); // expected-warning{{FALSE}}
}

if (fd != stdin)
fclose(fd);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I read this correctly, the test intends to check if the stdin system global would not get invalidated after calling malloc() (or for that matter any other opaque system header fn), right?

Could we simplify this into:

auto orig_stdin = stdin;
free(malloc(10)); // calling some opaque system fns
clang_analyzer_eval(stdin == orig_stdin); // TRUE

Same suggestion goes to the next example too.

@steakhal
Copy link
Contributor

BTW wouldn't this patch break the semantics of freopen? That one should invalidate the file ptr, right?
Could you demonstrate this?

@NagyDonat
Copy link
Contributor

BTW wouldn't this patch break the semantics of freopen? That one should invalidate the file ptr, right? Could you demonstrate this?

No, freopen should not invalidate the file ptr, because it modifies the stream object (associating it with a different file) without changing the pointer identity of the stream object. In fact, if you consider the signature, you can see that in

FILE *freopen(const char *pathname, const char *mode, FILE *stream);

the third parameter takes an rvalue of type FILE *, so if you call e.g. freopen("somefile.txt", "r", stdin) then this cannot invalidate the FILE * pointer stored in the global variable stdin.

@steakhal
Copy link
Contributor

BTW wouldn't this patch break the semantics of freopen? That one should invalidate the file ptr, right? Could you demonstrate this?

No, freopen should not invalidate the file ptr, because it modifies the stream object (associating it with a different file) without changing the pointer identity of the stream object. In fact, if you consider the signature, you can see that in

FILE *freopen(const char *pathname, const char *mode, FILE *stream);

the third parameter takes an rvalue of type FILE *, so if you call e.g. freopen("somefile.txt", "r", stdin) then this cannot invalidate the FILE * pointer stored in the global variable stdin.

makes sense, thx

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants